Developing Standards for Improved Data Quality and for Selecting Fit for Use Biodiversity Data
- by
- Chapman, Arthur; Belbin, Lee; Zermoglio, Paula; Wieczorek, John; Morris, Paul; Nicholls, Miles; Rees, Emily Rose; Veiga, Allan; Thompson, Alexander; Saraiva, Antonio; James, Shelley; Gendreau, Christian; Benson, Abigail; Schigel, Dmitry
- Publication date
- 2020-03-20
- Usage
- Attribution 4.0 International


- Topics
- data quality, profile, framework, fitness for use, standards, tests and assertions, data quality tests, vocabularies, Darwin Core, GBIF
- Publisher
- Pensoft Publishers
- Collection
- biodiversity
- Contributor
- Pensoft Publishers
- Language
- English
- Rights
- https://biodiversitylibrary.org/permissions
- Rights-holder
- Copyright held by individual article author(s).
- Volume
- 4
- Item Size
- 43.3M
- Abstract
-
The quality of biodiversity data publicly accessible via aggregators such as GBIF (Global Biodiversity Information Facility), the ALA (Atlas of Living Australia), iDigBio (Integrated Digitized Biocollections), and OBIS (Ocean Biogeographic Information System) is often questioned, especially by the research community.
The Data Quality Interest Group, established by Biodiversity Information Standards (TDWG) and GBIF, has been engaged in four main activities: developing a framework for the assessment and management of data quality using a fitness for use approach; defining a core set of standardised tests and associated assertions based on Darwin Core terms; gathering and classifying user stories to form contextual-themed use cases, such as species distribution modelling, agrobiodiversity, and invasive species; and developing a standardised format for building and managing controlled vocabularies of values.
Using the developed framework, data quality profiles have been built from use cases to represent user needs. Quality assertions can then be used to filter data suitable for a purpose. The assertions can also be used to provide feedback to data providers and custodians to assist in improving data quality at the source. In a case study, two different implementations of tests and assertions based around the Darwin Core 'Event Date' terms were also tested against GBIF data, to demonstrate that the tests are implementation agnostic, can be run on large aggregated datasets, and can make biodiversity data more fit for typical research uses.
- Addeddate
- 2025-06-09 23:30:39
- Bhl_virtual_titleid
- 210882
- Bhl_virtual_volume
- v.4 (2020)
- Call number
- 10_3897_biss_4_50889
- Call-number
- 10_3897_biss_4_50889
- Foldoutcount
- 0
- Genre
- article
- Identifier
- developingstand4chap
- Identifier-ark
- ark:/13960/s2jz2bczdv2
- Identifier-bib
- 10_3897_biss_4_50889
- Identifier-doi
- 10.3897/biss.4.50889
- Ocr
- tesseract 5.3.0-6-g76ae
- Ocr_detected_lang
- en
- Ocr_detected_lang_conf
- 1.0000
- Ocr_detected_script
- Latin
- Ocr_detected_script_conf
- 0.9959
- Ocr_module_version
- 0.0.21
- Ocr_parameters
- -l eng
- Page_number_confidence
- 87
- Page_number_module_version
- 1.0.5
- Page_range
- 1-47
- Pages
- 47
- Pdf_degraded
- invalid-jp2-headers
- Pdf_module_version
- 0.0.25
- Possible copyright status
- In copyright. Digitized with the permission of the rights holder.
- Ppi
- 300
- Source
- Biodiversity Information Science and Standards 4
- Year
- 2020
IN COLLECTIONS
Biodiversity Heritage Library
Uploaded by Smithsonian Libraries and Archives on
Open Library